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Abstract 

Following the approach of Yu, Singh, and Krakaucr [Phys. Rev. B 43 (1991) 6411] we extended 
the linearized augmented plane wave code WIEN of Blaha, Schwarz, and coworkers by the 
evaluation of forces. In this paper we describe the approach, demonstrate the high accuracy 
of the force calculation, and use them for an efficient geometry optimization of poly-atomic 
systems. 
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PROGRAM SUMMARY 

Title of program extension: fhi95force 
catalogue number: ... 

Program obtainable from: CPC Program Library, 
Queen's University of Belfast, N. Ireland (see applica- 
tion form in this issue) 

CPC Program Library programs used: cat. no. : ABRE; 
title: WIEN; ref m CPC: 59 (1990) 399 

Licensing provisions: none 

Computer, operating system, and installation: 

• IBM RS/6000; AIX; Fritz-Haber-Institut der 
Max-Planck-Gesellschaft; Berlin. 

• CRAY Y-MP; UNICOS; IPP der Max-Planck- 
Gesellschaft; Garching. 

Operating system: UNIX 

Programming language: FORTRAN?? 
(non-standard feature is the use of ENDDO) 

floating point arithmetic: 64 bits 

Memory required to execute with typical data: 
64 Mbyte (depends on case) 

No. of bits in a word: 64 

No. of processors used: one 

Has the code been vectorized? no 

Memory required for test run: 64 MByte 

Keywords 

density functional theory, linearized augmented plane 
wave method, LAPW, supercell, total energy, forces, 
structure optimization, molecular dynamics, crystals, 
surfaces, molecules 

Nature of the physical problem 

For ab-tnitto studies of the electronic and magnetic 
properties of poly-atomic systems, such as molecules, 
crystals, and surfaces, it is of paramount importance 
to determine stable and metastable atomic geometries. 
This task of structure optimization is greatly acceler- 
ated and, in fact, often only feasible if the forces acting 
on the atoms are known. 

The computer-code described in this article enables such 
calculations. 

Method of solution 

The full-potential linearized augmented plane wave 
(FP-LAPW) method is well known to enable accurate 
calculations of the electronic structure and magnetic 
properties of crystals |, |, H, |, |, §, B- Within 
the supercell approach it has also been used for stud- 
ies of defects in the bulk and for crystal surfaces. For 
the evaluation of the atomic forces within this method 
we follow the approach outlined by Yu and coworkers 

§. In order to minimize the total energy as a function 
atomic positions we employ a damped Newton dy- 
namics scheme or alternatively the variable metric 



algorithm of Broyden et al. Several ap- 

plications of this approach to chemisorption at surfaces 
have already been published |l4| |l^] . 

Restrictions on the complexity of the problem 
Inversion and orthorombic symmetry of the elementary 
cell is required. 

Typical running time 

The additional force calculation increases the running 
time of a typical self-consistent total energy calculation 
by 5-10%. 
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LONG WRITE-UP 



1 Introduction 

The augmented plane wave (APW) methods |l|, ^, ^ and in particular its linearized form, 
the LAPW g, 0, |, H, |1^, |12|, |T3|, 0], enable accurate calculations of the electronic and 
magnetic properties of poly-atomic systems from first principles. One successful implementation 
is the program package WIEN. This full-potential LAPW (FP-LAPW) code developed by Blaha, 
Schwarz and coworkers [^] has been successfully applied to a wide range of problems 
and systems such as complex crystals |17], transition metal surfaces [18|, and molecules (see the 
H2 test case in this paper). 

The main output of the WIEN code is the total energy for a given atomic arrangement. Using 
only this quantity the minimization of the total energy of a poly-atomic system is a costly and 
often impractible task. However, the situation is changed if the forces which act on the different 
atoms are available. Only recently, force formulations within the LAPW method have been 
introduced and tested by several authors ||l9|, We followed the approach of 

Yu, Singh, and Krakauer (YSK) pO] and implemented the direct calculation of atomic forces 



into the original |13] and the WIEN93 version [25| of the WIEN code. The obtained forces are 



highly accurate and can be used in an efficient minimization scheme to optimize the geometry 
of poly-atomic systems. 

The remainder of the paper is organized as follows. In Sec. II we summarize the most 
important features of the YSK force formalism. Section III describes the energy minimization 
procedure. In Sec. IV results of test calculations are presented, and, finally. Sec. V and VI 
describe the structure and the installation of our program package fhi95f orce. 



2 Evaluation of Forces 

2.1 Forces within Density- Functional Theory 

Within density-functional theory the ground state total energy is given by the minimum of a 
total energy functional with respect to the electron density n{r) 

E^°^[n] = T[n] + U[n] + E'"'[n] (1) 
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where T[n], U[n], and £'^'^[n] represent the functionals of the non-interacting many electron 
kinetic, the electrostatic, and the exchange-correlation (xc) energy, respectively. The electron 
density n'^'^(r) which minimizes £'*°*[n] is found by solving self-consistently the Kohn-Sham (KS) 
equations ^] 

HMr) = [f + V''''{r)]Mr) = e^M^ (2) 

where T is the single-particle kinetic energy operator. Throughout the paper we use Rydberg 
atomic units. The effective potential y^(r) is given by 

y<=ff(f) = y^«(f) + F"=(f) (3) 

where 

V^^ir) = [ d'r' -$0^ - V (4) 
J \r-r'\ ^ \r-Ri\ 

denotes the total electrostatic potential created by the electron density 

oo 
i=l 

and the nuclear charges. The quantities fi are the occupation numbers of the eigenstates ipi^r). 
The I-th nucleus is positioned at Rj and carries the charge Zj. The xc potential y^'^(r) is 

The KS total energy E^°^ [n] is then calculated by using the expressions 

00 „ 

i=l 

1 f .,3^^s^f^ir)n{f') /■ ,3-,^/-;^ 1^ ZjZj 



U[n] = - fd'rd'f'^^^^ 

2 7 \f-r'\ J ^'^\f-Rj\ 2f^j\R,-Rj\ 

^^^[n] = / dVn(f)e^^ [n] (r) , (7) 



where e^'^[n] is the xc energy per particle. For finite temperatures, or in order to stabilize the 
convergence of the self-consistent calculation the electronic states may be occupied according to 
a Fermi distribution at a non-zero electron temperature T*^' 

-1 



(8) 
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where ep and ks are the Fermi energy and the Bohzmann constant. In this case, one has to 

(9) 



minimize the free energy |28, 29 1 



el eel 



with the entropy S"^' given by 



5^=1 = -2kB [fi log fi + a- h) log(l - /.)] - So . 



(10) 



Here, 5*0 is chosen such that the entropy S'^^ vanishes for T'^^ = K. The force on the /-th nucleus 
is defined as the negative derivative of the free energy with respect to the nuclear coordinate 
Ri: 



Fi 



dRi 



(11) 



It is evaluated by displacing the respective nucleus by a small amount ARj and calculating the 
resulting first-order change of the free energy A!F. For the different energy terms in eq. (0) we 
obtain the following first order variations 

AT[n] = ^fi^i + fi^^i - / dVAn(f)y'='^(r) - / d^fn{r)AV^^{r) 
i=i i=i •' '' 

AU[n] = J dVAn(r)F°'(r) - Fj^'^ARj 

A^^^[n] = y"dVAn(r)y^'=(f) . (12) 

The first term A/jej in AT[n] is canceled with the contribution from the variation of the 
entropy S^^ in eq. ( pTj ) |30|. The Hellmann-Feynman (HF) force Fj^^ describes the classical 
electrostatic force exerted on the /-th nucleus by all the other charges of the system (electrons 



and nuclei) [pl| , 32] and is obtained from 

Fr = Z,V,-yr(r)| 
where Vj^{r) is the electrostatic potential 

y/«(f) = V^'ir) + — 



r=Ri 



Zi 



(13) 



(14) 



\r-Ri\ 

felt by the /-th nucleus. Taking the definition of the effective potential in eq. (^) into account 
the force on the /-th nucleus is then given by 



Fi = Fj^^ - 1^ fiAei - J dVn^=(r)Ay^*^ 



(f) 



(15) 
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It should be mentioned that the free energy variation and thus the forces Fj are invariant to 
any first order deviation An(r) from the self-consistent density n^^{f). Thus, force expressions 
different from eq. ( p!5| ) may be derived if this variational freedom is used, e.g. if the electron 
density is shifted rigidly with the nuclei ||3^, |3|] . 

2.2 Basis Set Corrections to the Hellmann-Feynman Force 

Usually, the HF force F]^^ can be evaluated quite easily using eq. (0). The second term in 



eq. (15) describes the so-called Pulay forces pq| . The explicit expression of this correction to 
the HF force Fj^^ depends on how the KS equation is solved. One usually expands a KS 
wavefunction tp = tpi{r) at the eigenvalue e = ej linearly using a set of basis functions (p^: 

^ = Y,Cu(pu ■ (16) 

With this variational basis functions, the KS equation becomes the following secular equation 

5] (i7^, - eO^,) a = , (17) 

or equivalently 

Y.Cl{R^,-eO^,)C^ = ^ . (18) 
The Hamilton and overlap matrix elements are defined as 

H^, = {^^\R\<i>u) (19) 

0^,u = k<PMv) . (20) 

Both eqs. ( |T7|) and (|l^) also hold for a shifted atomic configuration Ki -|- A/?/. Thus, we obtain 
for the latter 

^ (pi + AC;) + Ai/^, - (e + Ae) (O^,, + AO^,)] (C, + A^) = . (21) 

The variations of the matrix elements are as follows: 

AO^, = (A</)^|</.,) + (0^|A0,) = 2Re(A0^|0,) (22) 
A//^, = 2Re(A0^|//|0,) + (</.^|Af + Ay^ff|0,) . (23) 

If we allow only first order changes in A^/ and take into account eq. ( p^ we arrive at the 
following simplified form of eq. ( |2l| ) 

^C;C,(AF^,-eAO^,-AeO^,) = . (24) 
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Employing the normalization conditions 



C*^Cy0^y — 1 



(25) 



eq. (^4|) may be transformed to an expression for the linear change Ae of the KS eigenvalue e. 
We obtain 



Ae,: 



J2 C;C, [2Re(A0^|F - + (</>^| AT + AV 



cffi 



111/ 



and finally rewrite the second part of eq. (15): 



. Note that we use the relation 



2Re{^\H-e, 
dRi 



+ 



dT 
dRi 



efTi 



ifii/ 



d''fn{r)AV''^{r) 



(26) 



(27) 



(28) 



In eq. (^) the first term within the square brackets is called incomplete basis set correction 



|39| . Its existence was first noted by Hurley [40|. The respective sum vanishes if the 
basis functions are independent on the atomic positions or if their first order changes A^^ lie 
completely within the subspace described by the original basis set {(j)r]}, i-e., 

|A</)^) = ^ I A<A^) . (29) 



Then, eq. ( |18D can be applied to eq. ( |27D and the incomplete basis set correction vanishes explic- 
itly. This is for example the case if the basis set is complete. The second term of the HF 
force correction {(p^j^ldT /dRi\(j)i,) may be non-zero if the kinetic energy is position dependent, 
e.g. if due to the use of a mixed basis set the calculated kinetic energy is discontinuous. 

Up to now the formulation for the total force Fj has remained completely general. In 
the following we will focus on the application of the outlined formalism within the FP-LAPW 
method. 



2.3 LAPW Method 

In the augmented plane-wave (APW) methods space is divided into the interstitial region (IR) 
and non-overlapping muffin-tin (MT) spheres centered at the atomic sites ]|]. By this the atomic- 
like character of the wavefunctions, potential, and electron density close to the nuclei can be 
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described accurately as can be the smoother behavior of these quantities in between the atoms. 
In the IR the basis set consists of plane waves exp{iKf). The choice of a computationally efficient 
and accurate representation of the wavefunctions within the MT spheres has been discussed by 
several authors, e.g. j^, In the original APW formulation introduced by Slater |l|, |2| 

the plane-waves are augmented to the exact solutions of the Schrodinger equation within the 
MT at the calculated eigenvalues. This approach is exact but computationally very expensive 
because it leads to an explicit energy dependence of the Hamilton and overlap matrices. Instead 
of performing a single diagonalization to solve the KS equation one repeatedly needs to evaluate 
the determinant of the secular equation (|l^ in order to find its zeros and thus the single particle 
eigenvalues e^. 

In the linearized APW the difficulty is removed by using a fixed set of suitable MT radial 
functions |0, Within Andersen's approach, used also in the WIEN code, radial solutions 

uj (e[, r/) of the KS equation at fixed energies ej and their energy derivatives iij {ej ,rj) are em- 
ployed. Basically, this choice corresponds to a linearization of uf (e, r) around ef The concept 
implies that the radial functions uj (e;) and iij (ei) and the respective overlap and Hamilton ma- 
trix elements need to be calculated only for a few energies ej . Moreover, all KS energies for 
one A:-point are found by a single diagonalization (for a detailed discussion see |14]). 

The LAPW basis functions </>^(r) which are used for the expansion of the KS wavefunctions 

^k,i(^)= E C,{K)c^g{r) (30) 

\K\<K^f 

are defined as 

f n-^/'^exp{iKr), f G IR 

"^^^""^^ E[«L(i?W(e[,r/) + 6L(i?)t'f(ef,r/)m™(r/), rj<sj . 

V Im 

Here, K = k + G denotes the sum of a reciprocal lattice vector G and a vector k within the 
first Brillouin zone. The wave function cutoff limits the number of these K vectors and 
thus the size of the basis set. The symbols in eq. (^) have the following meaning: $7 is the unit 
cell volume, sj is the MT radius, and fj = f — Rj is a vector within the MT sphere of the I-th 
atom. Note that Yimir) represents a complex spherical harmonic with ll_m(^) = (~l)™^im(^)- 
The radial functions ui{ei,r) and ui{€i,r) are solutions of the equations 

H'P''ui{ei,r)Yim{r) = eiUi{ei,r)Yim{f) (32) 
H'P''ui{^i,r)YUr) = [eM^i^r) + ui{ei,r)]YUf) (33) 
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which are regular at the origin. The operator H^^^ contains only the spherical average, i.e., the 
/ = component, of the effective potential within the MT. The ei should be chosen somewhere 
within that energy band with /-character. By requiring that value and slope of the basis func- 
tions are continuous at the surface of the MT sphere the coefficients aim{K) and him{K) are 
determined. 

The representation of the potentials and densities resembles the one employed for the wave 
functions, i.e., 

VSffexp(iGf), fGlR 
|G|<GP°t (34) 
Y.^ltA^l)yUri), ri<si . 

liri 

Thus, no shape approximation is introduced. The quality of this full-potential description is 
controlled by the cutoff parameter for the lattice vectors G and the size of the {l,m)- 

representation inside MTs. 



2.4 LAPW Forces 

The basis functions 4>j^{r) defined in eq. ( |3l| ) are centered at the nuclei positions Rj and thus 
move with the atoms. Furthermore, the single-particle kinetic energy is not continuous at the 
MT sphere boundaries where both types of basis functions are matched. Thus, an accurate force 
formalism has to deal with both matrix elements 

(^|//-e,|0^,) and (35) 
dui dui 

in the correction to the HF force in eq. (15). YSK derived expressions for both terms [|^, 

Independently, a successful formulation of the forces was found by Soler and Williams [p!9| , ^] 

who started from the kinetic energy functional T[n\ = fi J d'^ripi'V^ipi and employed a 

formulation of the potential energy U[n] introduced by Weinert p2| . 

In the following we briefiy summarize the YSK method. The force on the I-th. atom can be 

written as 

Fj = FfF + Ff°^'' + Ff ™i + F™i (36) 

where F™' {Ff^"-^^) combines the Pulay corrections due to valence (semicore) electrons while 
^core jgj^otes the respective core term. The different contributions are 

1 yes ( \ 

= Zj Y: lim^ '""''^ ' v,^[rjYUri)] (37) 

m=—l ^ 
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'val 



k,i K,K' 



(38) 
(39) 



The semicore correction Ff'^™^ is equivalent to eq. 3£. The evaluation of the HF force is 
straight forward in a FP-LAPW calculation because the electrostatic potential Vj^{r) is needed 
already for the evaluation of the KS effective potential V^^{r). Hence, we obtain 

I -y?ii{ri) + yr-i,i{rj) \ 



F. 



HF 



Zi\ — lim — 

V Svr rj^o rj 



\ 



(40) 



Alternatively to the approach of YSK, the core correction Fj°'^'^ in eq. (|3^ ) can also be deduced 
via eq. (^). Within the WIEN code the core electron density n'^°^^{r) is calculated using only 
the spherical part of the Hamiltonian. Hence, the core wavefunctions of the KS equation can 
be viewed as an (incomplete) set of spherical basisfunctions </)™''*'(f). The derivative of these 
functions with respect to the atomic position Rj is given by 



Thus, the relevant matrix elements in the incomplete basis set correction in eq. (| 
written as 



Re(-V 



core I 



H 



d^r<t>T' *mr{H - 6,) C-(r) 



/ 



dV-V,-{0r' *(r) [H - e,] 0r'(f)} 



(41) 
can be 

(42) 
(43) 



which leads to eq. (^) if we take into account that V^(-ff — ej) = VfV^^{'r) and choose the inte- 
gration boundaries for the integral in eq. (^) at the MT sphere boundaries where the functions 
6^°'^^ vanish. 



The terms {(j)j^i\H — ei\(f)j^)i in eq. ( |39D are given by the overlap and Hamilton matrix ele- 
ments. Naturally, the sum . has to be executed after the determination of the KS eigenvalues 
and the occupation numbers fj:-. We are left with the integrals in eqs. ( |3^ ) and (|39|). They can 
be derived from the general case [see eq. (A5) in YSK] 



d^rVrifir)Yr^{fMr)Yi 
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f{r)g{r] 



which is calculated using for the first term on the right-hand side of eq. (44): 
d 



/(/ + m + l)(/ + m + 2) 
+^ V (2/ + l)(2/ + 3) 



/, ,N (I — m)(l — m — 1) ^ , 



/ (/-m + l)(/-m + 2) 
V (2/ + l)(2/ + 3) 



(; + m)(/ + m-l) 
+ (2/ + l)(2/-l) ^^-i'-i(^) 



+ (^ + 1) 



(2/ + l)(2^ + 3) 
(/ + m)(/ — m) 



l—l,m 



(f) . 



(44) 



(45) 



(2/ + 1)(2/ - 1) 

The spherical integrals in the second part of eq. (^) and in the surface integral in the second line 
of eq. ( |39D can be evaluated by transforming them into Gaunt integrals of the form / ^*m'^im"^«m 
11. 



3 Structure Optimization 

We are now in a position to minimize the total energy E^°^{R) of a system with M independent 
atoms with respect to 3Af-dimensional position vector R = (Ri, R2, Rm) using the directly 
calculated force F = —dE^°^/dR. The simplest minimization scheme is to choose the next 
geometry step always along the force direction (steepest descent). This method can be inefficient 
if the Born-Oppenheimer surface happens to be a long and narrow valley. In order to avoid 
oscillations within such a valley one should take the previous minimization history into account. 
This is, for example, accomplished by using one of the following two procedures, the variable 
metric method or the damped Newton dynamics scheme. 
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3.1 Variable Metric Method 

If the total energy surface close to a geometry {R*} is well-described by a quadratic approxi- 
mation 

E^°\R) = E^°\R*)-F{R*)-{R-R*) + ^{R-R*)-AiR*)-^ , (46) 

where A{R*) is the Hessian matrix, the variable metric or quasi-Newton method provides a 
very efficient minimization. The derivative of eq. ( |46| ) with respect to R leads to the following 
expression for the force F{R) 

F{R) = F{R*) - A{R*) • {R - R*) . (47) 

Looking for the minimum of E^°^(R) means searching for a zero of this force. Hence, we have 

AR* = R- R* = A'\R*) ■ F{R*) . (48) 

The left side describes the finite step AR* which points into the minimum provided the inverse 
Hessian A^^{R*) and quadratic approximation of E^°^{R) are exact. Usually, this is not the 
case and we face two serious problems: An exact inverse hessian is not available and AR* from 
eq. ( ^8[ ) may not direct us in a downhill direction if higher order terms dominate the description 
of the energy surface. 

Fortunately, these difficulties can be removed by applying an algorithm developed by Broy- 



den, Fletcher, Goldfarb, and Shanno (BFGS) p^ , |45| , 46|. Its realization within the variable 
metric method as a FORTRAN program is described in Ref . [^] . The method iteratively builds 
up an approximation of A^^{R*) by making use of the forces obtained during previous steps of 
the structure optimization. This is done in such a way that the matrix remains positive definite 
and symmetric. This guarantees that E^°^{R) decreases initially as we move into the direction 
AR*. So, if the attempted step leads to an increase of the total energy, i.e., AR* is too large, 
one just has to backtrack trying smaller steps along the same direction in order to obtain a lower 
energy. The minimization process terminates when all atomic forces for a geometry fall below a 
certain limit. 

3.2 Damped Newton Dynamics and Molecular Dynamics 

The variable metric method works well if the energy surface description is dominated by quadratic 
terms, e.g. close to a total energy minimum. However, if the quadratic approximation in eq. (|4^) 
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is not well founded an algorithm based on damped Newton dynamics is more robust and efficient 



|48|. In our approach we use for the time evolution of an arbitrary atomic coordinate Rm the 



finite difference equation 

where RJ^ and F.^ are the coordinate and the respective force at time step r. Note that the 
minimization always includes implicitly the history of displacements stored in the "velocity" 
coordinate (i?^ — i?^^). Damping and speed of motion are controlled by the two parameters rjm 
and 5m- An optimum choice of these two quantities would provide for a fast movement towards 
the closest local minimum on the Born-Oppenheimer surface and suppress oscillations around 
this minimum. A small damping factor r]m improves the stability of the atomic relaxation while 
a larger value allows for energy barriers to be overcome and thus to escape from local minima. 
Again, the relaxation continues until all force components are smaller in magnitude than a 
certain limit. Obviously, if the damping is switched off, i. e. 7?m = 1 the approach equals an 
ab-initio molecular dynamics method. 



4 Examples 

In the following we present two test cases, the free H2 molecule and the hydrogen atom as an 
adsorbate on the (110) surface of bcc Mo. The purpose of our study here is mainly to point out 
the agreement between our forces and numerical derivatives of the total energy. Also, we like to 
demonstrate the efficiency of our structure optimization. 

The free H2 molecule in the first example is modeled in a cubic unit cell with a side length of 
10 bohr. For the xc-potential we employ the generalized gradient approximation (GGA) |^9[. We 
choose a MT radius of 0.65 bohr, and for the LAPW basis set we use radial functions in the MT 
spheres up to /max = 8 and a plane wave basis expansion in the interstitial region up to (ET™^)^ = 
12 Ry. The {I, in) expansion for the potential goes up to Zmax = 4. Because of the small hydrogen 
MT radius a relatively high plane- wave cutoff energy £;cut,pot _ ^Qpot^"^ _ ]^ggj^y necessary 
in order to obtain a converged interstitial representation of the potential. In Fig.[l| we present 
directly calculated forces (solid dots) and compare them to forces obtained from a polynomial 
fit to the respective total energies (solid lines). The results demonstrate the excellent agreement 
between both data sets. If the potential cutoff is too small (^cut,pot ^ (G'pot)2 ^ 81 Ry) the 
directly (empty dots) and indirectly (dashed lines) evaluated forces differ considerably from each 
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1.30 1.35 1.40 1.45 1.50 1.55 1.60 



H-H distance (bohr) 

Figure 1: Atomic forces for a H-dimer as a function of the H-H interatomic distance. The 
data points represent the directly calculated forces on the H-atom while the lines stem from 
polynomial fits to the total energy. 
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Figure 2: Side (left) and top view (right) of the fuhy relaxed 5-layer H/Mo (110) system. The 
solid and empty circles represent Mo- and H-atoms, respectively. 

other. To our knowledge no other element besides hydrogen exhibits such a high sensitivity of 
the calculated atomic forces to the interstitial representation of the potential. As will be shown 
in the second example the cutoff parameter is also less critical for hydrogen if a larger MT 
radius can be chosen. 

As a second case we present the example of a structure optimization using the BFGS- 
minimization algorithm. The goal is to find the relaxed geometry of the Mo (110) surface covered 
with a full monolayer of hydrogen. This problem is of particular interest because it was suspected 
that the hydrogen adsorption induces a so-called top-layer-shift reconstruction, i.e., a shift of 
the Mo surface layer along the [TlO] direction relative to the bulk |5C]. The substrate surface is 



modeled by a five layer slab repeated periodically and separated by 16.6 bohr of vacuum. Details 
of the geometry are shown in Fig. ^. The calculated in-plane lattice constant of 5.91 bohr is used. 
The fc-integrations are evaluated on a mesh of 64 equally spaced points in the surface Brillouin 
zone. The MT radii are chosen to be 2.40 bohr and 0.90 bohr for Mo and H, respectively. The 



15 



kinetic-energy cutoff for the plane wave basis needed for the interstitial region is set to 12 Ry, 
and the {l,m) representation (inside the MTs) is taken up to /max = 8 for both Mo and H. 
Here, the hydrogen MT radius is relatively large compared to that used for the study of the 
H-dimer. Therefore, a plane-wave cutoff energy of 64 Ry for the representation of the potential 
is sufficient. The maximum angular momentum of the MT {l,m) expansion of the potential is 
set to /max = 4. All states are treated non-relativistically. 

In the beginning adsorbate, surface, and subsurface atoms are distorted along the [110] 
direction with respect to the clean (110) surface configuration in order to break the mirror 
symmetry of the clean surface. During the relaxation process visualized in Fig.^ all atoms 
are allowed to move freely perpendicular to the surface and parallel along the [ll0]-direction. 
The system is considered to be in a stable or (metastable) geometry when all force components 
are smaller than 3mRy/bohr. The error in the structure parameters of the relaxed system is 
±0.02 bohr. 

During the optimization process the hydrogen relaxes into a quasi threefold position with a 
|TlO]-offset of uu = 1.19 bohr from the long-bridge position and a height of c/h-Mo = 2.04 bohr 
above the surface layer. For the surface layer we find a relaxation of 3 it 0.3 % of the bulk 
inter-layer spacing. Furthermore, the surface atoms are slightly distorted by 0.09 bohr along 
the [TlO]-direction opposite to the hydrogen with respect to the subsurface. Thus, there is no 
evidence for a pronounced top-layer-shift reconstruction. The subsurface layer relaxes to its bcc 
lattice position. 



5 Structure of the Program 
5.1 Force Calculation 



The force implementation fhi95force was developed for WIEN |13] as well as for WIEN93 [^] 
which is a considerably improved version of the original package. Furthermore, our program is 
now part of the latest update called WIEN95 [51|. Our intention was to keep the changes related 



to the original code and the input-files as small as possible. The program and running structure 
of the program were kept unchanged. In the following, we summarize the basic features of the 
force computation within WIEN: 

F]^^ : lapwO 

The electrostatic potential is determined on a logarithmic radial mesh. For the calculation 
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Figure 3: H-Adsorption on Mo(llO): Minimization of the total energy for a slab with five Mo- 
layers. Shown are the distances (in bohr) of the adsorbate and surface atom positions from the 
optimized structure (left). The respective forces (right) are given in mRy/bohr. The structure 
parameters yn, zn, yi, and zi are defined in Fig.|2[ 
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of Fj^^ according to eq. ( ^0|) we use the first radial mesh point rjfi. Our experience is that 
Fj^^ behaves numerically stable close to the MT center, i.e., one could also choose r/.i or 
r/^s. Also, the influence of the logarithmic radial mesh chosen is uncritical. The resulting 
Hellmann-Feynman force Fj^^ is written to unit 70 (case.fhf). 

Pp^ : lapwl & lapw2 

The determination of the correction FJ'^^ is almost completely done in the subroutine 
12f or (of lapw2) which resembles the original WIEN-subroutine 12main plus the FORTRAN 
translation of the YSR eqs. (A12), (A17), and (A20). For the evaluation of / dr^p{r)W{r) 
the subroutine fvdrho is called. The only additional input needed is related to the non- 
spherical matrix elements {uii\V,^\ui) , {uii\Vi^\ui), {uii\V,^\ui) , and {uii\V,y\ui) in eq. (A20) 
of YSR. They are calculated in program lapwl as a part of the hamiltonian setup and 
transferred to lapw2 using the input/output unit 71 (case.nsh(s)). The final result 
Fj'^^ is written to unit 72 (case.fval or case.fsc). The force calculation which is only 
executed if the switch FOR instead of TOT is set in case . in2 (s) roughly triples the running 
time of lapw2. 

: core 



The core correction in eq. (38) is calculated by the routine f core using eq. (^^. Before 
that the non-spherical potential has to be read from unit 19 case.vns. Note that pcore(?^) 
is spherical. Therefore, only the effective potential parts VJ^(r) with / = 1 have to be 
taken into account. The subroutine fcore uses unit 73 (case. f cor) for the output of 

^core 

Fj : mixer 

The final step which is the summation of all (available) partial forces according to eq. (|^) is 
done by mixer. This executable writes the accumulated information to unit 70 (case . f tot) 
and unit 80 (case . f inM). The latter can be used as input file for the minimization program 
mini. 

Note that all forces written to the output are in Ry/bohr. Only the calculation of F™' can be 
switched on and off the reason being that the computer time due to Fj^^ , F^^'^ , and the output 
of the non-spherical matrix elements in lapwl is negligible. The partial forces Fj^^ and Ff""^^ are 
also excellent indicators to monitor the convergence of F™' as well as the total atomic force Fj. 
Therefore, it is convenient to run a self-consistent calculation until FJ^^ or Fj°'^'^ are converged 
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NEWT 

0.7 2 2 
0.7 1 1 1 



minimization modus : BFGS/NEWT 
eta, delta(l-3) of atom 1 
eta, delta(l-3) of atom 2 



Table 1: Input file unit 5 (case.inM) 

and then evaluate F™' as a final step. In this way the additional computing time necessary for 
the atomic forces can be kept at a very low level. 

5.2 Minimization 

The program mini is executed by invoking the UNIX command 'mini < mini.def ' in the case- 
directory. As a first step, the minimization option and the control parameters are read from 
unit 5 (see Tabled). Then the previous minimization history from unit 16 (case .tmpM) is used to 
internally generate the approximate inverse Hessian matrix (the velocity coordinate) if the BFGS 
formalism (damped Newton dynamics scheme) is employed. Now, the calculated total energy 
and forces are read from unit 15 (case.finM) and used to determine the next trial step. If the 
new geometry causes an overlap of MT spheres a smaller step along the same direction leading 
to touching MTs is chosen. Finally, the new trial geometry as well as the updated minimization 
history are written to unit 21 (case . structl) and unit 16 (case.tmpM), respectively. 

At this point, the FP-LAPW calculation using WIEN for the new geometry can take place. 
The procedure continues until no further energy minimization can be achieved or all atomic 
forces fall below a certain minimum. 

6 Installation 

The program package fhi95f orce is written in FORTRAN?? and should work on all machines 
where WIEN can be installed. Because changes of the original code have been kept at a minimum 
the adaptation of f hi95f orce to other versions of WIEN and to similar FP-LAPW codes should 
cause no problems. The extension and the data files are contained in a single tar archive called 
fhi95f orce .tar. The extraction on a UNIX machine should generate a directory entitled 
fhi95f orce with the following subdirectories: 
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• SRC_force 

This directory contains the source files of the subroutines which have to be added to the 
existing code, e.g. Iapw0_f . f is the extension for lapwO. (Sub)routines marked with an 
asterisks in Table|2| already exist in the original WIEN code and have to be removed before 
compiling the program. If the local version of WIEN differs from the original one it should 
be very easy to update the existing WIEN-routines (lapwO, atpar, lapw2, hf sd, and mixer) 
by hand. The (few) changes necessary for the force calculations are framed by FORTRAN 
comment lines (cfb at the beginning and cfe at the end). The compilation with make 
can be done using existing makefiles which, nevertheless, have to be updated with the 
additional source files. Before starting a calculation with the new executables it will also 
be necessary to update the input/output channels according to Table ^. 

• SRC_force93 

This is the WIEN93-version of SRC_f orce. 

• SRC_mini 

Here the source for the minimization program can be found. Calling the UNIX command 
make within the directory creates the executable mini. 

• Mo 

This case directory enables the study of the H-point phonon of bcc-Mo similar to the frozen- 



phonon calculations presented in |20|. It can also be used for testing the minimization 
program mini. 



7 Test run 

We recommend to run the Mo test case at the beginning with the already existing local version 
of WIEN, check the output for warnings and error messages, and than repeat the procedure with 
the updated WIEN- version setting the FOR switch in Mo . in2 and Mo . in2s. 

Then, a step-by-step minimization running alternately WIEN and the new program mini can 
take place. If this one-dimensional relaxation leads successfully from the distorted to the equi- 
librium bcc configuration, one may go on to structure optimizations with higher dimensionality. 
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source 


routine 


relevance for the force calculation 


lapwO_f . f 


* lapwO 


calculates and writes Hellmann-Feynman force to unit 70 


lapwl_f .f 


* atpar 


writes non-spherical matrix elements to unit 71 


lapw2_f . f 


charg2 
dfrad 
* lapw2 
12for 

mag 
sevald 
spline 
vdrho 


modification of charge 

obtains radial derivative of a radial function 
calls 12f or if switch FOR is set 

evaluates and writes valence (semi-core) partial forces to unit 72 

obtains magnitude of a 3-dim vector 
computes first derivative of a spline 
obtains coefficients for cubic interpolation spline 
calculates / (firV (r) Vp(r) 


core_f . f 


charge 
dfrad 
f core 
* hfsd 
sevald 
spline 


does Simpson integration inside a sphere 
see lapw2_f . f 

calculates core correction of force and writes result to unit 73 
calls f core 
see lapw2_f . f 
see lapw2_f . f 


mixer _f . f 


* mixer 
totf or 


calls totf or 

calculates and writes total force to unit 70 


mini . f 


dfpmin 

finish 

func 

haupt 

inv 

latgen 
Insrch 

meixstp 

nwtmin 
pairdis 
rotate 
rotdef 


minimizes a multi-dimensional function using the BFGS variable 
metric method 
writes final output 

reads total energy and atomic forces from unit 15 
handles input and output; calls dfpmin or nwtmin 

evaluates inverse of a matrix 
defines lattice (basis vectors) 

searches for a lower value of a multi-dimensional function along 
the search direction 

determines maximum possible atomic displacement without 
overlap of MTs 

minimizes total energy using damped Newton dynamics 
calculates pair distance of atoms 
rotates a vector 

selects symmetry operations of equivalent atoms 



Tabic 2: Summary of the source files contained in tlie directory SRC_force. The routines 
marked with an asterisk already exist in WIEN and have to be updated or substituted by the new 
versions. 
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executable 


unit 


I/O 


file- name 


comment 


lapwO 


70 





case 


fhf 


HcUmann-Feynman force 


lapwl 


71 


o 


case 


nsh(s) 


non-spherical matrix elements 


lapw2 


2 


I 


case 


in2(s) 


input file (switch for force-calculation) 
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I 


case 


vns 


non-spherical potential 




71 


I 


case 


nsh(s) 


non-spherical matrix elements 




72 





case 


f [val/sc] 


partial forces due to valence (semi-core) electrons 


core 


19 


I 


case 


vns 


non-sphcrical potential 




73 





case 


f cor 


partial force due to core electrons 


mixer 


70 


o 


case 


ftot 


sum of all available partial forces 




71 


I 


case 


fhf 


Hellmann-Feynman force 




72 


I 


case 


fval 


partial force due to valence electrons 




73 


I 


case 


f sc 


partial force due to semi-core electrons 




74 


I 


case 


f cor 


partial force due to core electrons 




80 





case 


finM 


input file for mini 


mini 


5 


I 


case 


inM 


input-file 




6 


o 


case 


outputM 


general output-file 




15 


I 


case 


finM 


total energy and atomic forces for current 

geometry 




16 


I/O 


case 


tmpM 


history of minimization 




20 


I 


case 


struct 


current struct-file 




21 


o 


case 


struct 1 


struct-file with new trial geometry 



Table 3: Input and output-files relevant for the force calculation and the minimization proce- 
dure. 
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Test run 



Output file Mo. force (steirting configuration) 



1. CYCLE 





mu 


F 


IFI 


Fx 


Fy 


Fz 


1 


1 


»> 


.OOOOOOOE+00 


.OOOOOOOE+00 


.OOOOOOOE+00 


.OOOOOOOE+00 


2. 


.CYCLE 












ja 


mu 


F 


IFI 


Fx 


Fy 


Fz 


1 


1 


H-F 


.3093266E+01 


. OOOOOOOE+00 


. OOOOOOOE+00 


-.3093266E+01 


1 


1 


VAL 


.3457794E-01 


. 8271806E-24 


. 3308722E-23 


.3457794E-01 


1 


1 


VAL 


. 1951423E+00 


-.3642919E-16 


-.563785 IE- 17 


. 1951423E+00 


1 


1 


COR 


.2423779E+01 


. OOOOOOOE+00 


. OOOOOOOE+00 


.2423779E+01 


1 


1 


»> 


.4397668E+00 


-.3642919E-16 


-.5637848E-17 


-.4397668E+00 


11. 


.CYCLE 












ja 


mu 


F 


IFI 


Fx 


Fy 


Fz 


1 


1 


H-F 


. 1621001E+01 


. OOOOOOOE+00 


. OOOOOOOE+00 


-.1621001E+01 


1 


1 


VAL 


. 1334964E-01 


. 1654361E-23 


. 6203855E-24 


. 1334964E-01 


1 


1 


VAL 


.6451487E-01 


-.3089976E-17 


-.4201283E-16 


.6451487E-01 


1 


1 


COR 


. 1409447E+01 


. OOOOOOOE+00 


. OOOOOOOE+00 


. 1409447E+01 


1 


1 


»> 


. 1336895E+00 


-.3089974E-17 


-.4201283E-16 


- . 1336895E+00 


12. 


.CYCLE 












ja 


mu 


F 


IFI 


Fx 


Fy 


Fz 


1 


1 


H-F 


.1621459E+01 


. OOOOOOOE+00 


. OOOOOOOE+OO 


-.1621459E+01 


1 


1 


VAL 


. 1335570E-01 


-.4135903E-24 


-.1240771E-23 


. 1335570E-01 


1 


1 


VAL 


.6455603E-01 


.3474868E-16 


-.9199455E-16 


.6455603E-01 


1 


1 


COR 


. 1409757E+01 


. OOOOOOOE+00 


. OOOOOOOE+00 


. 1409757E+01 


1 


1 


»> 


. 1337903E+00 


.3474868E-16 


-.9199455E-16 


- . 1337903E+00 



Output file Mo.tmpM (after three geometry steps) 



3 

3 



-93.75512 


1 










147750000000D+01 


. OOOOOOOOOOOOD+00 


1 


.4775 


.0000 


.2500 


147750000000D+01 


. OOOOOOOOOOOOD+00 


1 


.4775 


.0000 


.2500 


159570000000D+01 


.668951500000D-01 


1 


.5957 


.0669 


.2700 


-93.76944 


2 










147750000000D+01 


. OOOOOOOOOOOOD+00 


1, 


.4775 


.0000 


.2500 


147750000000D+01 


. OOOOOOOOOOOOD+00 


1, 


.4775 


.0000 


.2500 


152880471000D+01 


.284468350000D-01 


1, 


.5288 


.0284 


.2587 


-93.77203 


3 










147750000000D+01 


. OOOOOOOOOOOOD+00 


1, 


.4775 


.0000 


.2500 


147750000000D+01 


. OOOOOOOOOOOOD+00 


1, 


.4775 


.0000 


.2500 


145353140400D+01 


-.133528500000D-01 


1, 


.4535 


-.0134 


.2459 


. 0000000 


4 










147750000000D+01 


.OOOOOOOOOOOOD+00 


1 


.4775 


.0000 


.2500 


147750000000D+01 


. OOOOOOOOOOOOD+00 


1, 


.4775 


.0000 


.2500 


144430222800D+01 


. OOOOOOOOOOOOD+00 


1, 


.4443 


.0000 


.2444 



